Identify Gram-negative bacterial secreted protein types by incorporating different modes of PSSM into Chou's general PseAAC via Kullback-Leibler divergence

J Theor Biol. 2018 Oct 7:454:22-29. doi: 10.1016/j.jtbi.2018.05.035. Epub 2018 May 29.

Abstract

Gram-negative bacterial secreted proteins are crucial for bacterial pathogenesis because they enable bacteria to interact with their environments. Identifying bacterial secreted proteins is therefore an important step in research on various diseases and the corresponding drugs. In this paper, we develop a feature design model named ACCP-KL-NMF that fuses PSSM-based auto-cross correlation analysis for feature extraction with a nonnegative matrix factorization algorithm based on Kullback-Leibler divergence for dimensionality reduction. This yields a 150-dimensional feature vector constructed on the training set. A support vector machine is then adopted as the classifier, and the jackknife test, regarded as the most objective cross-validation method, is chosen for evaluating the accuracy. The ACCP-KL-NMF model achieves satisfactory overall accuracy on the test set and also outperforms three other existing models. The numerical experimental results show that our model is effective and reliable for identifying Gram-negative bacterial secreted protein types. Moreover, it is anticipated that the proposed model could be beneficial for other biological sequences in future research.
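The feature pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`acc_features`, `kl_nmf`, `kl_divergence`), the maximum lag, the min-max rescaling used to make the features nonnegative, the samples-by-features orientation of the factorized matrix, and the reduced rank of 5 are all assumptions chosen for illustration, and the PSSMs are random placeholder matrices rather than real PSI-BLAST output. The factorization uses the standard multiplicative updates for the generalized Kullback-Leibler divergence.

```python
import numpy as np

def acc_features(pssm, max_lag):
    """Auto- and cross-covariance features of an L x 20 PSSM for lags 1..max_lag.

    For each lag, entry (j1, j2) of the covariance matrix averages
    centered[i, j1] * centered[i + lag, j2] over positions i; the diagonal
    holds the auto-covariance terms, the off-diagonal the cross terms.
    """
    pssm = np.asarray(pssm, dtype=float)
    length = pssm.shape[0]
    centered = pssm - pssm.mean(axis=0)
    feats = []
    for lag in range(1, max_lag + 1):
        cov = centered[:-lag].T @ centered[lag:] / (length - lag)
        feats.append(cov.ravel())
    return np.concatenate(feats)

def kl_divergence(V, WH, eps=1e-10):
    """Generalized Kullback-Leibler divergence D(V || WH)."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

def kl_nmf(V, rank, n_iter=200, seed=0, eps=1e-10):
    """Factor a nonnegative V (n x m) as W (n x rank) @ H (rank x m)
    using multiplicative updates for the KL divergence."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Placeholder data: 12 random "PSSMs" of varying length (NOT real profiles).
rng = np.random.default_rng(42)
lengths = rng.integers(50, 80, size=12)
pssms = [rng.integers(-5, 6, size=(int(n), 20)) for n in lengths]

# One row of 400 * max_lag covariance features per protein.
X = np.vstack([acc_features(p, max_lag=2) for p in pssms])

# Assumption: min-max rescale each feature to [0, 1] so NMF can be applied.
span = X.max(axis=0) - X.min(axis=0)
span = np.where(span > 0, span, 1.0)  # guard against constant columns
V = (X - X.min(axis=0)) / span

# Rows of W are the reduced feature vectors (rank 5 here, 150 in the paper).
W, H = kl_nmf(V, rank=5)
print(X.shape, W.shape, kl_divergence(V, W @ H))
```

In this sketch each row of `W` plays the role of the reduced feature vector for one protein. Such rows could then be passed to a support vector machine and evaluated with leave-one-out (jackknife) cross-validation, for example with scikit-learn's `SVC` and `LeaveOneOut`, which are omitted here to keep the example dependency-free.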

Keywords: Correlation analysis; Nonnegative matrix factorization; Position-specific scoring matrix; Secreted proteins; Support vector machine.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acids / metabolism
  • Bacterial Proteins / analysis*
  • Bacterial Proteins / metabolism*
  • Computational Biology / methods*
  • Gram-Negative Bacteria / metabolism*
  • Models, Biological
  • Secretory Pathway*
  • Software
  • Support Vector Machine

Substances

  • Amino Acids
  • Bacterial Proteins